An algorithmic theory of caches
Abstract
The ideal-cache model, an extension of the RAM model, evaluates the referential locality exhibited by algorithms. It is characterized by two parameters: the cache size Z and the line length L. As its name suggests, the ideal-cache model uses an automatic, optimal, omniscient replacement algorithm. The performance of an algorithm on the ideal-cache model consists of two measures: the RAM running time, called work complexity, and the number of misses on the ideal cache, called cache complexity. This thesis proposes the ideal-cache model as a "bridging" model for caches in the sense proposed by Valiant [49]. A bridging model for caches serves two purposes. It can be viewed as a hardware "ideal" that influences cache design, and it can be used as a powerful tool to design cache-efficient algorithms. This thesis justifies the proposal by presenting theoretically sound caching mechanisms that closely emulate the ideal-cache model and by presenting portable cache-efficient algorithms, called cache-oblivious algorithms. In Chapter 2, we consider a class of caches, called random-hashed caches, which have perfectly random line-placement functions. We look at analytical tools to compute the expected conflict-miss overhead of random-hashed caches relative to fully associative LRU caches. Specifically, we obtain good upper bounds on the conflict-miss overhead for random-hashed set-associative caches. We then consider augmenting these caches with fully associative victim caches [33] and find that the conflict-miss overhead drops dramatically for well-sized victim caches. An interesting contribution of Chapter 2 is that victim caches need not be fully associative: random-hashed set-associative victim caches perform nearly as well as fully associative victim caches. Chapter 3 introduces the notion of cache-obliviousness.
Cache-oblivious algorithms are algorithms that do not require knowledge of the parameters of the ideal cache, namely Z and L. A cache-oblivious algorithm is said to be optimal if it has asymptotically optimal work and cache complexity, when compared to the best cache-aware algorithm, on any ideal cache. Chapter 3 describes optimal cache-oblivious algorithms for matrix transposition, FFT, and sorting. For an ideal cache with $Z = \Omega(L^2)$, the number of cache misses for an $m \times n$ matrix transpose is $\Theta(1 + mn/L)$. The number of cache misses for either an n-point FFT or the sorting of n numbers is $\Theta(1 + (n/L)(1 + \log_Z n))$. Chapter 3 also proposes a $\Theta(mnp)$-work algorithm to multiply an $m \times n$ matrix by an $n \times p$ matrix that incurs $\Theta(1 + (mn + np + mp)/L + mnp/(L\sqrt{Z}))$ cache faults. All the proposed algorithms rely on recursion, suggesting divide-and-conquer as a powerful paradigm for cache-oblivious algorithm design. In Chapter 4, we shall see that optimal cache-oblivious algorithms satisfying the "regularity condition" perform optimally on many platforms, such as multilevel memory hierarchies and probabilistic LRU caches. Moreover, optimal cache-oblivious algorithms can also be ported to some existing manual-replacement cache models, such as SUMH [8] and HMM [4], while maintaining optimality.
Thesis Supervisor: Charles E. Leiserson
Title: Professor of Computer Science and Engineering
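To make the divide-and-conquer idea in the abstract concrete, the following is a minimal sketch of a recursive matrix transpose that never refers to Z or L. It is an illustration under stated assumptions, not the thesis's own code: the names `transpose`, `transpose_rec`, and the cutoff `base` are hypothetical, and Python lists of lists do not have the contiguous memory layout the cache analysis assumes, so the sketch shows only the recursion structure.

```python
def transpose_rec(A, B, r0, r1, c0, c1, base=4):
    """Write the transpose of the block A[r0:r1][c0:c1] into B (B[j][i] = A[i][j]).

    `base` is a fixed cutoff chosen for illustration; it does not depend on
    the cache parameters Z and L, so the routine stays cache-oblivious.
    """
    rows, cols = r1 - r0, c1 - c0
    if rows <= base and cols <= base:
        # Small block: transpose it directly.
        for i in range(r0, r1):
            for j in range(c0, c1):
                B[j][i] = A[i][j]
    elif rows >= cols:
        # Halve the longer dimension (rows) and recurse on both halves.
        mid = r0 + rows // 2
        transpose_rec(A, B, r0, mid, c0, c1, base)
        transpose_rec(A, B, mid, r1, c0, c1, base)
    else:
        # Halve the longer dimension (columns) and recurse on both halves.
        mid = c0 + cols // 2
        transpose_rec(A, B, r0, r1, c0, mid, base)
        transpose_rec(A, B, r0, r1, mid, c1, base)


def transpose(A):
    """Return the transpose of a rectangular list-of-lists matrix A."""
    m = len(A)
    n = len(A[0]) if m else 0
    B = [[None] * m for _ in range(n)]
    transpose_rec(A, B, 0, m, 0, n)
    return B
```

Because the longer dimension is halved at every step, the recursion eventually reaches sub-blocks small enough to fit in any cache, whatever its size. That is the intuition behind analysing such algorithms without tuning them to Z or L.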
Publication date: 2014